Articulatory Control of HMM-based Parametric Speech Synthesis using Feature-Space-Switched Multiple Regression

Abstract

In previous work we proposed a method to control the characteristics of synthetic speech flexibly by integrating articulatory features into a hidden Markov model (HMM) based parametric speech synthesiser. In this method, a unified acoustic-articulatory model is trained, and context-dependent linear transforms are used to model the dependency between the two feature streams. In this paper, we go significantly further and propose a feature-space-switched multiple regression HMM to improve the performance of articulatory control. A multiple regression HMM (MRHMM) is adopted to model the distribution of acoustic features, with articulatory features used as exogenous "explanatory variables". A separate Gaussian mixture model (GMM) is introduced to model the articulatory space, and articulatory-to-acoustic regression matrices are trained for each component of this GMM, instead of for the context-dependent states in the HMM. Furthermore, we propose a task-specific context feature tailoring method to ensure compatibility between state context features and articulatory features that are manipulated at synthesis time. The proposed method is evaluated on two tasks, using a speech database with acoustic waveforms and articulatory movements recorded in parallel by electromagnetic articulography (EMA). In a vowel identity modification task, the new method achieves better performance when reconstructing target vowels by varying articulatory inputs than our previous approach. A second vowel creation task shows our new method is highly effective at producing a new vowel from appropriate articulatory representations which, even though no acoustic samples for this vowel are present in the training data, is shown to sound highly natural.
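The switching idea in the abstract can be illustrated with a minimal sketch. The abstract gives no equations, so the form used here is an assumption: the acoustic mean for an HMM state is taken as the state mean plus a regression term, where the regression matrix is selected softly by the posterior of a GMM over the articulatory space. All names (`gmm_posteriors`, `acoustic_mean`, the `*_demo` values) are hypothetical and the toy numbers are not taken from the paper.

```python
import numpy as np

def gmm_posteriors(y, weights, means, variances):
    """Posterior P(k | y) of a diagonal-covariance GMM over articulatory features."""
    diff = y - means                                   # (K, Dy)
    log_pdf = -0.5 * np.sum(diff**2 / variances
                            + np.log(2 * np.pi * variances), axis=1)
    log_joint = np.log(weights) + log_pdf              # (K,)
    log_joint -= log_joint.max()                       # numerical stability
    post = np.exp(log_joint)
    return post / post.sum()

def acoustic_mean(y, state_mean, A, weights, means, variances):
    """Assumed form: state_mean + sum_k P(k | y) * A[k] @ [y; 1].

    A has shape (K, Dx, Dy + 1): one regression matrix per GMM component,
    rather than one per context-dependent HMM state.
    """
    xi = np.append(y, 1.0)                             # articulatory vector with bias
    post = gmm_posteriors(y, weights, means, variances)
    per_component = A @ xi                             # (K, Dx)
    return state_mean + post @ per_component           # (Dx,)

# Toy setup (hypothetical): 2 GMM components, 2-D articulatory, 3-D acoustic.
w_demo = np.array([0.5, 0.5])
mu_demo = np.array([[-5.0, -5.0], [5.0, 5.0]])
var_demo = np.ones((2, 2))
A_demo = np.zeros((2, 3, 3))
A_demo[1] = np.eye(3)                                  # only component 1 contributes
state_mean_demo = np.array([1.0, 2.0, 3.0])

y_demo = np.array([5.0, 5.0])                          # lies in component 1's region
mean_demo = acoustic_mean(y_demo, state_mean_demo, A_demo,
                          w_demo, mu_demo, var_demo)
```

Because the regression matrices are tied to regions of the articulatory space, moving `y_demo` toward the other component changes which matrix dominates, independently of the HMM state context. That is the property the abstract relies on when articulatory inputs are modified at synthesis time.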
